Evolving Fuzzy Prototypes for Efficient Data Clustering
Authors
Abstract
number of prototypes used to represent each class, the position of each prototype within its class and the membership function associated with each prototype.
This paper proposes a novel, evolutionary approach to data clustering and classification which overcomes many of the limitations of traditional systems. The approach rests on the optimisation of both the number and positions of fuzzy prototypes using a real-valued genetic algorithm (GA). Because the GA acts on all of the classes at once, the system benefits naturally from global information about possible class interactions. In addition, the concept of a receptive field for each prototype is used to replace the classical distance-based membership function by an infinite fuzzy support, multi-dimensional Gaussian function centred over the prototype and with a unique variance in each dimension, reflecting the tightness of the cluster. Hence, the notion of nearest-neighbour is replaced by that of nearest attracting prototype (NAP). The proposed model is a completely self-optimising, fuzzy system called GA-NAP.
Most data clustering algorithms, including the popular K-means algorithm, require a priori knowledge about the problem domain to fix the number and starting positions of the prototypes. Although such knowledge may be assumed for domains whose dimensionality is fairly small or whose underlying structure is relatively intuitive, it is clearly much less accessible in hyperdimensional settings, where the number of input parameters may be very large. Classical systems also suffer from the fact that they can only define clusters for one class at a time. Hence, no account is taken of potential interactions among classes. These drawbacks are further compounded by the fact that the ensuing classification is typically based on a fixed, distance-based membership function for all prototypes.
This paper proposes a novel approach to data clustering and classification which overcomes the aforementioned limitations of traditional systems. The model is based on the genetic evolution of fuzzy prototypes. A real-valued genetic algorithm (GA) is used to optimise both the number and positions of prototypes. Because the GA acts on all of the classes at once and measures fitness as classification accuracy, the system naturally profits from global information about class interaction. The concept of a receptive field for each prototype is also presented and used to replace the classical, fixed distance-based function by an infinite fuzzy support membership function. The new membership function is inspired by that used in the hidden layer of RBF networks. It consists of a multi-dimensional Gaussian function centred over the prototype and with a unique variance in each dimension that reflects the tightness of the cluster. During classification, the notion of nearest-neighbour is replaced by that of nearest attracting prototype (NAP). The proposed model is a completely self-optimising, fuzzy system called GA-NAP.
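The receptive-field idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exact Gaussian form (unnormalised, with a factor of 2 in the exponent), the rule that the attracting prototype is the one with the highest membership, and all names (`membership`, `classify`, `accuracy`, the `prototypes` list) are assumptions made for illustration. The GA's encoding and operators are not specified in the text, so only a fitness function based on classification accuracy is shown.

```python
import math

def membership(x, centre, variances):
    """Gaussian membership of point x in one prototype's receptive field.

    Assumed form: exp(-sum_d (x_d - c_d)^2 / (2 * var_d)), i.e. one
    variance per dimension and no normalising constant.
    """
    exponent = sum((xi - ci) ** 2 / (2.0 * vi)
                   for xi, ci, vi in zip(x, centre, variances))
    return math.exp(-exponent)

def classify(x, prototypes):
    """Label of the nearest attracting prototype (assumed: highest membership)."""
    best = max(prototypes,
               key=lambda p: membership(x, p["centre"], p["variances"]))
    return best["label"]

def accuracy(prototypes, data):
    """Fraction of (point, label) pairs classified correctly.

    This is the kind of fitness a GA could maximise over a candidate
    set of prototypes for all classes at once.
    """
    correct = sum(1 for x, y in data if classify(x, prototypes) == y)
    return correct / len(data)

# Hypothetical prototypes: a loose cluster "a" and a tight cluster "b".
prototypes = [
    {"label": "a", "centre": (0.0, 0.0), "variances": (4.0, 4.0)},
    {"label": "b", "centre": (3.0, 0.0), "variances": (0.25, 0.25)},
]

# The point (2, 0) is closer to prototype "b" in Euclidean distance
# (1 versus 2), but the wide receptive field of "a" attracts it more
# strongly, so the NAP rule assigns it to "a".
print(classify((2.0, 0.0), prototypes))
```

In this example a plain nearest-neighbour rule would choose "b", while the variance-aware rule chooses "a", which is the behavioural difference the receptive field is meant to capture.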
Similar resources
A Fuzzy Clustering and Fuzzy Merging Algorithm
Some major problems in clustering are: i) find the optimal number K of clusters; ii) assess the validity of a given clustering; iii) permit the classes to form natural shapes rather than forcing them into normed balls of the distance function; iv) prevent the order in which the feature vectors are read in from affecting the clustering; and v) prevent the order of merging from affecting the clus...
Shape Retrieval by Partially Supervised Fuzzy Clustering
In this work we propose the use of partially supervised fuzzy clustering to create a two-level indexing structure useful for enabling efficient shape retrieval. Similar shapes are grouped by a fuzzy clustering algorithm that embeds a partial supervision mechanism exploiting domain knowledge expressed in terms of a set of labeled shapes. After clustering, a set of prototypes representative of sh...
Generalized Fuzzy Clustering Method
This paper presents a new hybrid fuzzy clustering method. In the proposed method, cluster prototypes are values that minimize the introduced generalized cost function. The proposed method can be considered as a generalization of fuzzy c-means (FCM) method as well as the fuzzy c-median (FCMed) clustering method. The generalization of the cluster cost function is made by applying the Lp norm. The...
Wasserstein Metric Based Adaptive Fuzzy Clustering Methods for Symbolic Data
Given the current limitations of fuzzy clustering metrics, the aim of this paper is to present new Wasserstein metric based adaptive fuzzy clustering methods for partitioning symbolic interval data. The Wasserstein metric shows advantages in extracting distribution information from symbolic interval data. Besides, the proposed fuzzy clustering methods also emphasize correlation structure between indices...
An Evolving Fuzzy-GARCH Approach for Financial Volatility Modeling and Forecasting
Volatility forecasting is a challenging task that has attracted the attention of market practitioners, regulators and academics in recent years. This paper proposes an evolving fuzzy-GARCH approach to model and forecast the volatility of the S&P 500 and Ibovespa indexes. The model combines the concept of evolving fuzzy systems with the GARCH modeling approach in order to consider the principles of ...